- Go to github.com/jensroes/bristol-ws-2025
- Click on: Code > Download ZIP, then unzip the directory on your machine.
- Open the project by double-clicking on bristol-ws-2025.Rproj

\[ \begin{aligned} y_i &\sim \mathcal{N}(\mu, \sigma^2) \end{aligned} \]
\[ \begin{aligned} y_i &\sim \mathcal{N}(\mu_i, \sigma^2)\\ \mu_i &= \beta_0 + \beta_1 \cdot x_{1i} \\ \end{aligned} \]
The value of \(\mu_i\) can be decomposed using the linear regression equation.
Use lm to easily implement such a model:

model <- lm(y ~ x, data = data)
rnorm generates unimodal, symmetrically distributed values (r in rnorm = random; norm = normal).

# Generate data
y <- rnorm(n = 10, mean = 550, sd = 10)

# Generate data
y <- rnorm(n = 100, mean = 550, sd = 10)

# Generate data
y <- rnorm(n = 1000, mean = 550, sd = 10)

# Decompose the mean
beta_0 <- 500
beta_1 <- 50

# Generate data
y <- rnorm(n = 1000, mean = beta_0 + beta_1, sd = 10)
# Set parameter values
n <- 1000
beta_0 <- 500
beta_1 <- 50
sigma <- 100

# Random data for group 1
group_1 <- rnorm(n = n/2, mean = beta_0, sd = sigma)
# Random data for group 2
group_2 <- rnorm(n = n/2, mean = beta_0 + beta_1, sd = sigma)
# Generate data
sim_data <- tibble(group_1 = group_1,
group_2 = group_2)
# Preview data
glimpse(sim_data)
Rows: 500
Columns: 2
$ group_1 <dbl> 558, 400, 470, 685, 448, 397, 456, 525, 530, 356, 440, 493, 49…
$ group_2 <dbl> 665, 562, 598, 459, 447, 606, 406, 564, 651, 393, 367, 639, 44…
# Change data format
sim_data <- pivot_longer(sim_data,
                         cols = c(group_1, group_2),
                         names_to = "x",
                         values_to = "y")
# Preview data
glimpse(sim_data)
Rows: 1,000
Columns: 2
$ x <chr> "group_1", "group_2", "group_1", "group_2", "group_1", "group_2", "g…
$ y <dbl> 558, 665, 400, 562, 470, 598, 685, 459, 448, 447, 397, 606, 456, 406…
# As a reminder, these are the parameter values
beta_0 <- 500
beta_1 <- 50
sigma <- 100
lm uses maximum likelihood estimation to determine the set of parameter values under which the data are most likely.
# Normal linear model of simulated data
model <- lm(y ~ x, data = sim_data)
coef(model) # Model coefficients
(Intercept) xgroup_2
498.6 43.4
sigma(model) # Standard deviation
[1] 103
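To see the idea behind maximum likelihood, candidate parameter values can be scored by the log-likelihood they assign to the simulated data. This is a sketch (not part of the original exercises), using base R's dnorm and the sim_data tibble generated above:

```r
# Log-likelihood of the data under candidate parameter values
loglik <- function(b0, b1, s, data) {
  mu <- b0 + b1 * (data$x == "group_2")  # decompose the mean per observation
  sum(dnorm(data$y, mean = mu, sd = s, log = TRUE))
}

loglik(500, 50, 100, sim_data)  # at the true values: relatively high
loglik(300,  0, 100, sim_data)  # far from the true values: much lower
```

lm finds the parameter values that maximise exactly this quantity.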
Check scripts
- exercises/normal_model_simulation_1.R
- exercises/normal_model_simulation_2.R

Observe how changing the number of observations affects the model estimates.
- rnorm samples normally distributed data
- sigma was the same

data <- read_csv("../data/martin-etal-2010-exp3a.csv") %>%
  glimpse()
Rows: 1,036
Columns: 4
$ item   <dbl> 11, 11, 11, 11, 11, 11, 25, 25, 25, 25, 25, 37, 37, 37, 37, 5, …
$ ppt    <dbl> 10, 9, 11, 6, 7, 2, 10, 9, 11, 7, 2, 10, 9, 6, 2, 10, 9, 11, 6,…
$ rt     <dbl> 1055, 2010, 461, 977, 1152, 1079, 1194, 1267, 1160, 684, 1367, …
$ nptype <chr> "conjoined", "conjoined", "conjoined", "conjoined", "conjoined"…
summarise(data, across(c(ppt, item), list(n = ~length(unique(.))), .names = "{col}"))
# A tibble: 1 × 2
ppt item
<int> <int>
1 12 48
In standard repeated measures ANOVAs only one source of variance can be modelled at a time.
For example, one would aggregate across items within participants, neglecting the by-item variance.
data_ppt <- data %>%
  summarise(rt = mean(rt), .by = c(nptype, ppt))
data_ppt

# A tibble: 24 × 3
  nptype      ppt    rt
  <chr>     <dbl> <dbl>
1 conjoined    10  944.
2 conjoined     9 1298.
3 conjoined    11 1410.
4 conjoined     6 1000.
5 conjoined     7  922.
# ℹ 19 more rows
Each rt observation \(i \in 1 \dots N\) can be said to come from a normal distribution with mean \(\mu_i\) and variance \(\sigma^2\):
\[ \text{rt}_i \sim \mathcal{N}(\mu_i, \sigma^2)\\ \]
Using the linear regression equation, the mean \(\mu_i\) can be decomposed into

\[ \mu_i = \beta_0 + \beta_1 \cdot \text{nptype}_i + \text{ppt}_i \]
where \(\text{nptype}_i\) is dummy coded (by default as treatment contrast in alphabetical order)
\[ \text{nptype}_i = \left\{ \begin{array}{ll} 0, \text{ if nptype}_i = \texttt{conjoined}\\ 1, \text{ if nptype}_i = \texttt{simple}\\ \end{array} \right. \]
Because
\(\mu_i = \beta_0 + \beta_1 \cdot 0 = \beta_0\),
\(\beta_0\) is the average rt for conjoined NPs.
Because
\(\mu_i = \beta_0 + \beta_1 \cdot 1 = \beta_0 + \beta_1\), the sum \(\beta_0 + \beta_1\) is the average rt for simple NPs, and \(\beta_1\) is the difference in rts between conjoined and simple NPs.
and finally \(\text{ppt}_i \sim \mathcal{N}(0, \sigma^2_\text{ppt})\)
library(afex)
m <- aov_car(rt ~ Error(ppt/nptype), data = data_ppt)
summary(m)
Univariate Type III Repeated-Measures ANOVA Assuming Sphericity
Sum Sq num Df Error SS den Df F value Pr(>F)
(Intercept) 28783750 1 587304 11 539.11 1.1e-10 ***
nptype 6217 1 13043 11 5.24 0.043 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
library(lme4)
m <- lmer(rt ~ nptype + (1|ppt), data = data_ppt)
anova(m)
Type III Analysis of Variance Table with Satterthwaite's method
Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
nptype 6217 6217 1 11 5.24 0.043 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Complete exercise script linear-models/exercises/repeated_measures_model.R
and after that linear-models/exercises/mixed_effects_model.R
m <- lmer(rt ~ nptype + (1|ppt) + (1|item), data = data)
summary(m)$coef
             Estimate Std. Error    df t value Pr(>|t|)
(Intercept)    1111.8       48.0  11.9    23.2 3.11e-11
nptypesimple    -33.4       14.5 977.9    -2.3 2.16e-02
How do we test effects in lmer models? One option is model comparison:

m_null <- lmer(rt ~ 1 + (1 | ppt) + (1 | item), data = data)
anova(m_null, m)
Data: data
Models:
m_null: rt ~ 1 + (1 | ppt) + (1 | item)
m: rt ~ nptype + (1 | ppt) + (1 | item)
npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
m_null 4 14319 14339 -7155 14311
m 5 14316 14340 -7153 14306 5.28 1 0.022 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Alternatively, use lmerTest for Satterthwaite-based p-values:

m <- lmerTest::lmer(rt ~ nptype + (1|ppt) + (1|item), data = data)
summary(m)$coef

             Estimate Std. Error    df t value Pr(>|t|)
(Intercept)    1111.8       48.0  11.9    23.2 3.11e-11
nptypesimple    -33.4       14.5 977.9    -2.3 2.16e-02
confint(m, "nptypesimple")
             2.5 % 97.5 %
nptypesimple -61.8  -4.93
summary(m)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: rt ~ nptype + (1 | ppt) + (1 | item)
Data: data
REML criterion at convergence: 14289
Scaled residuals:
Min 1Q Median 3Q Max
-4.518 -0.596 -0.140 0.448 4.965
Random effects:
Groups Name Variance Std.Dev.
item (Intercept) 1649 40.6
ppt (Intercept) 25996 161.2
Residual 54536 233.5
Number of obs: 1036, groups: item, 48; ppt, 12
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 1111.8 48.0 11.9 23.1 3.1e-11 ***
nptypesimple -33.4 14.5 977.9 -2.3 0.022 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr)
nptypesimpl -0.151
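The variance components above can be turned into intraclass correlations, i.e. the proportion of total variance attributable to each grouping factor. A minimal sketch (not in the original slides), assuming m is the fitted lmer model from above:

```r
# Extract variance components and compute the by-participant ICC
vc <- as.data.frame(VarCorr(m))
v_item <- vc$vcov[vc$grp == "item"]
v_ppt  <- vc$vcov[vc$grp == "ppt"]
v_res  <- vc$vcov[vc$grp == "Residual"]
v_ppt / (v_item + v_ppt + v_res)  # ≈ 25996 / (1649 + 25996 + 54536) ≈ 0.32
```

Roughly a third of the total rt variance here is due to differences between participants.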
summary(m)$coef
             Estimate Std. Error    df t value Pr(>|t|)
(Intercept)    1111.8       48.0  11.9    23.2 3.11e-11
nptypesimple    -33.4       14.5 977.9    -2.3 2.16e-02
(coefs <- m@beta) # extract estimates
[1] 1111.8 -33.4
# Predicted mean for np conjoined
coefs[1] + coefs[2] * 0
[1] 1112
# Predicted mean for np simple
coefs[1] + coefs[2] * 1
[1] 1078
# Check predictor contrasts (default: treatment)
contrasts(factor(data$nptype))

          simple
conjoined      0
simple         1
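Treatment coding is not the only option. With sum coding the intercept becomes the grand mean. A sketch (not part of the original exercises), reusing the data and lmerTest from above:

```r
# Refit with a sum-coded predictor
data$nptype_sum <- factor(data$nptype)
contrasts(data$nptype_sum) <- contr.sum(2)
m_sum <- lmerTest::lmer(rt ~ nptype_sum + (1 | ppt) + (1 | item), data = data)
fixef(m_sum)  # intercept ≈ grand mean; slope ≈ half the condition difference
```

The choice of contrasts changes the meaning of the coefficients, not the fit of the model.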
data_pred <- expand.grid(item = 1:48, ppt = 1:12, nptype = unique(data$nptype)) %>%
mutate(pred = predict(m, newdata = .),
m = "random intercepts")
select(data, item, nptype) %>% unique() %>% arrange(item, nptype)
# A tibble: 96 × 2
   item nptype
  <dbl> <chr>
1     1 conjoined
2     1 simple
3     2 conjoined
4     2 simple
5     3 conjoined
# ℹ 91 more rows
select(data, ppt, nptype) %>% unique() %>% arrange(ppt, nptype)
# A tibble: 24 × 2
ppt nptype
<dbl> <chr>
1 1 conjoined
2 1 simple
3 2 conjoined
4 2 simple
5 3 conjoined
# ℹ 19 more rows

# Maximal random effects structure (with correlation)
m_max <- lmer(rt ~ nptype + (nptype|ppt) + (nptype|item), data = data)
# Maximal random effects structure (without correlation)
m_max_nocor <- lmer(rt ~ nptype + (nptype||ppt) + (nptype||item), data = data)
bind_rows(data_pred, data_pred_cor, data_pred_nocor) %>%
filter(ppt %in% c(7, 11)) %>%
  mutate(across(m, ~case_when(str_detect(., "intercepts") ~ "\n(1|ppt) + (1|item)",
                              str_detect(., "without") ~ "\n(nptype||ppt) + (nptype||item)",
                              str_detect(., "with ") ~ "\n(nptype|ppt) + (nptype|item)"))) %>%
rename(`random effects` = m) %>%
ggplot(aes(x = nptype, y = pred, group = item)) +
geom_point(size = .1) +
geom_line(linewidth = .1) +
facet_grid(ppt ~ `random effects`, scales = "free", labeller = label_both) +
labs(y = "predicted rt", caption = "Lines represent different items")
# Maximal random effects structure (with correlation)
m_max <- lmer(rt ~ nptype + (nptype|ppt) + (nptype|item), data = data)
summary(m_max)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: rt ~ nptype + (nptype | ppt) + (nptype | item)
Data: data
REML criterion at convergence: 14288
Scaled residuals:
Min 1Q Median 3Q Max
-4.505 -0.596 -0.147 0.449 5.002
Random effects:
Groups Name Variance Std.Dev. Corr
item (Intercept) 2749.8 52.44
nptypesimple 522.1 22.85 -1.00
ppt (Intercept) 27159.7 164.80
nptypesimple 52.4 7.24 -1.00
Residual 54359.2 233.15
Number of obs: 1036, groups: item, 48; ppt, 12
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 1111.9 49.3 11.5 22.58 7e-11 ***
nptypesimple -33.6 15.0 125.7 -2.24 0.027 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr)
nptypesimpl -0.310
optimizer (nloptwrap) convergence code: 0 (OK)
boundary (singular) fit: see help('isSingular')
Task: pair designs with their corresponding maximal random-effects structure!
Model_A <- lmer(rt ~ nptype + (1|ppt) + (nptype|item), data = data)
Model_B <- lmer(rt ~ nptype + (1|ppt) + (1|item), data = data)
Model_C <- lmer(rt ~ nptype + (nptype|ppt) + (nptype|item), data = data)
Model_D <- lmer(rt ~ nptype + (nptype|ppt) + (1|item), data = data)
Complete exercise script linear-models/exercises/mixed_effects_model_rirs.R
Type of outcome variable
lme4::lmer(dv ~ iv + (random slopes | random intercepts), data = data) # continuous outcome
lme4::glmer(dv ~ iv + (random slopes | random intercepts), data = data, family = binomial()) # binary outcome
lme4::glmer(dv ~ iv + (random slopes | random intercepts), data = data, family = poisson()) # count outcome
lme4::glmer.nb for over-dispersed counts; NBZIMM::glmm.zinb for zero-inflated count models
ordinal::clmm(dv ~ iv + (random slopes | random intercepts), data = data) # ordinal outcome

Frequentist mixed-effects models require familiarity with different packages.
lme4 author Bates proposed ways to deal with convergence failures by using "parsimonious" random effects structures (Bates et al. 2015).

Top tip: brms (Bürkner 2017) fits Bayesian mixed-effects models using a syntax that roughly mimics lme4's. brms supports many outcome families in one package: gaussian, lognormal, bernoulli, poisson, zero_inflated_poisson, skew_normal, shifted_lognormal, exgaussian, et cetera. brms creates Stan code to compile a probabilistic MCMC (Markov Chain Monte Carlo) sampler.

Complete script linear-models/exercises/mixed_effects_model_brms.R
Frequentist mixed-effects models
library(lme4) fit_lmer <- lmer(rt ~ nptype + (nptype|ppt) + (nptype|item), data = data)
Bayesian mixed-effects models
library(brms) fit_brm <- brm(rt ~ nptype + (nptype|ppt) + (nptype|item), data = data)
coef(summary(fit_lmer)) # Coefficients of frequentist model
             Estimate Std. Error    df t value Pr(>|t|)
(Intercept)    1111.9       49.3  11.5   22.58 6.96e-11
nptypesimple    -33.6       15.0 125.7   -2.24 2.70e-02
fixef(fit_brm) # Coefficients of Bayesian model
             Estimate Est.Error   Q2.5    Q97.5
Intercept      1108.0      50.2 1010.1 1206.889
nptypesimple    -33.2      17.2  -66.3    0.749
\[ Pr(\text{data} \mid H_0 = \text{TRUE}) \]
\[ Pr(H \mid \text{data}) \]
\[ \underbrace{Pr(\theta \mid \text{data})}_{\text{posterior}} \propto \underbrace{Pr(\theta)}_{\text{prior}} \cdot \underbrace{Pr(\text{data} \mid \theta)}_{\text{likelihood}} \]

- Likelihood: probability of the data given the model parameter value(s)
- Prior: what we already knew about the model parameter(s)
- Posterior: our updated belief in the parameter value(s) after seeing the data
It tells us what we want to know!
Instead of concluding
\[ p(\text{data} \mid H_0 = \text{TRUE}) \]
we can answer
\[ p(H \mid \text{data}) \]
Bayesian regression models can be fitted with dedicated R packages (brms, rstanarm, rethinking). brms uses probabilistic sampling to determine the posterior uncertainty about the parameter values. Progress of the probabilistic sampler:
By default:
- 2,000 iterations for parameter estimation per chain
- iterations / 2 warm-up samples: discarded eventually
- 4 chains to establish convergence
- Change cores (check parallel::detectCores()) for running chains in parallel.
- How many posterior samples have we got for inference (= chains * (iterations - warm-up))?
- How many iterations / chains do you need?
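These defaults can be changed in the brm() call. A sketch (values are illustrative, not recommendations):

```r
library(brms)
fit <- brm(rt ~ nptype + (nptype | ppt) + (nptype | item),
           data = data,
           iter = 4000,    # total iterations per chain
           warmup = 2000,  # discarded warm-up iterations
           chains = 4,     # independent chains for convergence checks
           cores = parallel::detectCores())  # run chains in parallel
# Posterior samples for inference: chains * (iter - warmup) = 4 * 2000 = 8000
```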
sim <- fit_brm$fit@sim
samples <- map(1:4, ~sim$samples[[.]] %>%
as_tibble() %>%
select(starts_with("b")) %>%
mutate(chain = .x,
iteration = 1:n())) %>% bind_rows() %>%
pivot_longer(starts_with("b")) %>%
mutate(name = factor(name, levels = unique(name)[c(1,2)], ordered = T),
chain = factor(chain))
pivot_wider(samples, names_from = name, values_from = value) %>%
filter(chain == 1) %>%
mutate(chain = paste0("Chain: ", chain)) %>%
ggplot(aes(x = b_Intercept, y = b_nptypesimple, colour = chain)) +
scale_color_viridis_d("") +
geom_path(show.legend = F, colour = "white") +
facet_wrap(~chain) +
labs(title = "Parameter space") +
scale_x_continuous(limits = c(-100, 1300), breaks = seq(-150, 1600, 250)) +
scale_y_continuous(limits = c(-150, 100), breaks = seq(-150, 600, 50)) +
theme(legend.justification = "top")
pivot_wider(samples, names_from = name, values_from = value) %>%
filter(chain == 1) %>%
mutate(chain = paste0("Chain: ", chain)) %>%
ggplot(aes(x = b_Intercept, y = b_nptypesimple, colour = chain, alpha = iteration, label = iteration)) +
scale_color_viridis_d("") +
geom_path(show.legend = F) +
geom_text(show.legend = F, size = 2, aes(alpha = -iteration)) +
facet_wrap(~chain) +
labs(title = "Parameter space") +
scale_x_continuous(limits = c(-100, 1300), breaks = seq(-150, 1600, 250)) +
scale_y_continuous(limits = c(-150, 100), breaks = seq(-150, 600, 50)) +
theme(legend.justification = "top")
samples %>%
filter(chain %in% 1, iteration %in% 1:100) %>%
ggplot(aes(y=value, x = iteration, colour = chain)) +
scale_color_viridis_d("Chains") +
geom_path() +
facet_wrap(~name, scales = "free") +
labs(title = "Traceplot")
samples %>%
filter(chain %in% 1:2, iteration %in% 1:100) %>%
ggplot(aes(y=value, x = iteration, colour = chain)) +
scale_color_viridis_d("Chains") +
geom_path() +
facet_wrap(~name, scales = "free") +
labs(title = "Traceplot")
samples %>%
filter(chain %in% 1:4, iteration %in% 1:100) %>%
ggplot(aes(y=value, x = iteration, colour = chain)) +
scale_color_viridis_d("Chains") +
geom_path() +
facet_wrap(~name, scales = "free") +
labs(title = "Traceplot")
samples %>%
filter(chain %in% 1:4, iteration %in% 1:250) %>%
ggplot(aes(y=value, x = iteration, colour = chain)) +
scale_color_viridis_d("Chains") +
geom_path() +
facet_wrap(~name, scales = "free") +
labs(title = "Traceplot")
samples %>%
filter(chain %in% 1:4, iteration %in% 1:500) %>%
ggplot(aes(y=value, x = iteration, colour = chain)) +
scale_color_viridis_d("Chains") +
geom_path() +
facet_wrap(~name, scales = "free") +
labs(title = "Traceplot")
samples %>%
filter(chain %in% 1:4, iteration %in% 1:2000) %>%
ggplot(aes(y=value, x = iteration, colour = chain)) +
scale_color_viridis_d("Chains") +
geom_path() +
facet_wrap(~name, scales = "free") +
labs(title = "Traceplot")
samples %>%
filter(chain %in% 1:4, iteration %in% 1000:2000) %>%
ggplot(aes(y=value, x = iteration, colour = chain)) +
scale_color_viridis_d("Chains") +
geom_path() +
facet_wrap(~name, scales = "free") +
labs(title = "Traceplot")
linear-models/exercises/model_diagnostics.R

tibble(intercept = runif(1000, 0, 5000),
slope = runif(1000, -10000, 10000)) %>%
ggplot(aes(x=intercept, y=slope)) +
labs(subtitle = "Parameter space")
\[ \beta_0 \sim \mathcal{N}(1000 \text{ msecs}, ???) \]
tibble(intercept = runif(1000, 0, 5000),
slope = runif(1000, -10000, 10000)) %>%
ggplot(aes(x=intercept, y=slope)) +
labs(subtitle = "Parameter space")
\[ \beta_0 \sim \mathcal{N}(1000 \text{ msecs}, 150\text{ msecs}) \]
p <- tibble(intercept = rnorm(10000, mean = 1000, sd = 150),
slope = runif(10000, -10000, 10000)) %>%
ggplot(aes(x=intercept, y=slope)) +
geom_point(size = .25, colour = "transparent") +
geom_density_2d_filled(alpha = 0.25, show.legend = F) +
geom_density_2d(alpha = 0.5, show.legend = F, size = .15, color = "black") +
scale_fill_viridis_d(direction = -1, begin = 0, end = .6) +
scale_x_continuous(limits = c(0, 5000)) +
labs(subtitle = "Parameter space")
ggExtra::ggMarginal(p, type="density", size=10, alpha = .25, margins = "x")
\[ \beta_1 \sim \mathcal{N}(0\text{ msecs}, 510\text{ msecs}) \]
p <- tibble(intercept = rnorm(10000, mean = 1000, sd = 250),
slope = runif(10000, -10000, 10000)) %>%
ggplot(aes(x=intercept, y=slope)) +
geom_point(size = .25, colour = "transparent") +
geom_density_2d_filled(alpha = 0.25, show.legend = F) +
geom_density_2d(alpha = 0.5, show.legend = F, size = .15, color = "black") +
scale_fill_viridis_d(direction = -1, begin = 0, end = .6) +
scale_x_continuous(limits = c(0, 10000)) +
labs(subtitle = "Parameter space")
ggExtra::ggMarginal(p, type="density", size=10, alpha = .25, margins = "x")
\[ \beta_1 \sim \mathcal{N}(0\text{ msecs}, 100\text{ msecs}) \]
p <- tibble(intercept = rnorm(10000, mean = 1000, sd = 500),
slope = rnorm(10000, 0, 100)) %>%
ggplot(aes(x=intercept, y=slope)) +
geom_point(size = .25, colour = "transparent") +
geom_density_2d_filled(alpha = 0.25, show.legend = F) +
geom_density_2d(alpha = 0.5, show.legend = F, size = .15, color = "black") +
scale_fill_viridis_d(direction = -1, begin = 0, end = .6) +
scale_x_continuous(limits = c(0, 5000)) +
scale_y_continuous(limits = c(-300,1000), breaks = seq(-300, 900, 250)) +
labs(subtitle = "Parameter space")
ggExtra::ggMarginal(p, type="density", size=10, alpha = .25, margins = "both")
brms uses default priors that can be changed as appropriate. Especially the (flat) priors are worth specifying. Check the defaults used earlier:
prior_summary(fit_brm)
| prior | class | coef | group |
|---|---|---|---|
| (flat) | b | | |
| (flat) | b | nptypesimple | |
| lkj(1) | cor | | |
| lkj(1) | cor | | item |
| lkj(1) | cor | | ppt |
| student_t(3, 1039.5, 235) | Intercept | | |
| student_t(3, 0, 235) | sd | | |
| student_t(3, 0, 235) | sd | | item |
| student_t(3, 0, 235) | sd | Intercept | item |
| student_t(3, 0, 235) | sd | nptypesimple | item |
| student_t(3, 0, 235) | sd | | ppt |
| student_t(3, 0, 235) | sd | Intercept | ppt |
| student_t(3, 0, 235) | sd | nptypesimple | ppt |
| student_t(3, 0, 235) | sigma | | |
p <- tibble(intercept = rstudent_t(10000, 3, 1039.5, 235),
slope = runif(10000, -2000, 2000)) %>%
ggplot(aes(x=intercept, y=slope)) +
geom_point(size = .25, colour = "transparent") +
geom_density_2d_filled(alpha = 0.25, show.legend = F) +
geom_density_2d(alpha = 0.5, show.legend = F, size = .15, color = "black") +
scale_fill_viridis_d(direction = -1, begin = 0, end = .6) +
scale_x_continuous(limits = c(0, 5000)) +
scale_y_continuous(limits = c(-1000, 1000), breaks = seq(-1000, 1000, 250)) +
labs(subtitle = "Parameter space")
ggExtra::ggMarginal(p, type="density", size=10, alpha = .25, margins = "both")
You will refit our model with more informative priors in the next script.
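One way to specify priors in brms is with set_prior. This is a sketch with illustrative values; the exercise script may use different priors:

```r
library(brms)
# Weakly informative priors for intercept and condition effect (values illustrative)
priors <- c(
  set_prior("normal(1000, 500)", class = "Intercept"),
  set_prior("normal(0, 100)", class = "b", coef = "nptypesimple")
)
fit_brm_priors <- brm(rt ~ nptype + (nptype | ppt) + (nptype | item),
                      data = data, prior = priors)
prior_summary(fit_brm_priors)  # confirm the priors were applied
```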
linear-models/exercises/mixed_effects_model_brms_with_priors.R

beta <- as_draws_df(fit_brm) %>% pull(b_nptypesimple)
length(beta)
[1] 4000
beta[1:5]
[1] -34.0 -45.2 -33.0 -37.6 -23.7
ggplot(data = NULL, aes(x = beta)) +
geom_histogram(alpha = .55) +
labs(x = bquote("Estimated effect"~hat(beta)))
mean(beta)
[1] -33.2
ggplot(data = NULL, aes(x = beta)) +
geom_histogram(alpha = .55) +
labs(x = bquote("Estimated effect"~hat(beta))) +
geom_vline(xintercept = mean(beta), color = "darkred")
quantile(beta, probs = c(0.025, 0.975))
   2.5%   97.5% 
-66.281   0.749 
ggplot(data = NULL, aes(x = beta)) +
geom_histogram(alpha = .55, bins = 50, position = "identity", show.legend = F) +
labs(x = bquote("Estimated effect"~hat(beta))) +
  geom_errorbarh(aes(y = 50, xmin = quantile(beta, probs = 0.025),
                     xmax = quantile(beta, probs = 0.975)), color = "darkred", size = 1.5) +
annotate("text", x = 15, y = 55, label = "95% PI", colour = "darkred")
quantile(beta, probs = c(0.025, 0.975))
   2.5%   97.5% 
-66.281   0.749 
quantile(beta, probs = c(0.055, 0.945))
  5.5%  94.5% 
-60.83  -6.53 
ggplot(data = NULL, aes(x = beta)) +
geom_histogram(alpha = .55, bins = 50, position = "identity", show.legend = F) +
labs(x = bquote("Estimated effect"~hat(beta))) +
  geom_errorbarh(aes(y = 75, xmin = quantile(beta, probs = 0.055),
                     xmax = quantile(beta, probs = 0.945)), color = "darkred", size = 1.5) +
annotate("text", x = 15, y = 90, label = "89% PI", colour = "darkred")
mean(beta < 0)
[1] 0.973
linear-models/exercises/hypothesis_testing_post.R

beta2 <- beta[beta < 0]
ggplot(data = NULL, aes(x = beta, fill = beta > 0)) +
geom_histogram(alpha = .55, bins = 50, position = "identity", show.legend = F) +
scale_fill_manual(values = c("darkred", "grey")) +
geom_vline(xintercept = 0, colour = "grey20", linetype = "dotted") +
labs(x = bquote("Estimated effect"~hat(beta)))
mean(beta < -10)
[1] 0.917
quantile(beta, probs = c(.025, .5, .975))
   2.5%     50%   97.5% 
-66.281 -32.873   0.749 
mean(beta < 0)
[1] 0.973
There is nothing really special about 0.
ggplot(data = NULL, aes(x = beta, fill = beta > -10)) +
geom_histogram(alpha = .55, bins = 50, position = "identity", show.legend = F) +
geom_vline(xintercept = -10, colour = "grey20", linetype = "dotted") +
scale_fill_manual(values = c("darkred", "grey")) +
labs(x = bquote("Estimated effect"~hat(beta)))
ggplot(data = NULL, aes(x = beta)) +
geom_histogram(alpha = .55, bins = 50, position = "identity", show.legend = F) +
labs(x = bquote("Estimated effect"~hat(beta)))
library(bayestestR)
rope(beta, range = c(-5, 5))
# Proportion of samples inside the ROPE [-5.00, 5.00]:

inside ROPE
-----------
2.18 %
ggplot(data = NULL, aes(x = beta, fill = beta > -5 & beta < 5)) +
geom_histogram(alpha = .55, bins = 50, position = "identity", show.legend = F) +
geom_vline(xintercept = 5, colour = "grey20", linetype = "dotted") +
geom_vline(xintercept = -5, colour = "grey20", linetype = "dotted") +
scale_fill_manual(values = c("grey", "darkred")) +
labs(x = bquote("Estimated effect"~hat(beta)))
library(bayestestR)
rope(beta, range = c(-10, Inf))
# Proportion of samples inside the ROPE [-10.00, Inf]:

inside ROPE
-----------
6.05 %
ggplot(data = NULL, aes(x = beta, fill = beta > -10 & beta < Inf)) +
geom_histogram(alpha = .55, bins = 50, position = "identity", show.legend = F) +
geom_vline(xintercept = -10, colour = "grey20", linetype = "dotted") +
scale_fill_manual(values = c("grey", "darkred")) +
labs(x = bquote("Estimated effect"~hat(beta)))
sigma <- as_draws_df(fit_brm) %>% pull(sigma)
delta <- beta / sigma
abs(mean(delta)) # very small effect (see Cohen's d)
[1] 0.142
rope(delta, range = c(-0.1, 0.1))
# Proportion of samples inside the ROPE [-0.10, 0.10]:

inside ROPE
-----------
26.34 %
Complete script linear-models/exercises/hypothesis_testing_rope.R

\[ \text{BF} = \frac{p(H_1 \mid y)}{p(H_0 \mid y)} \]
dnorm(0, 0, 100)
[1] 0.00399
dnorm(0, mean(beta), sd(beta))
[1] 0.00361
dnorm(0, 0, 100) / dnorm(0, mean(beta), sd(beta))
[1] 1.11
pd <- tibble(x = seq(-200, 200, by = 10),
Prior = dnorm(x, 0, 100),
Posterior = dnorm(x, mean(beta), sd(beta))) %>%
pivot_longer(Prior:Posterior)
data_at_zero <- filter(pd, x == 0)
pr <- filter(data_at_zero, name == "Prior") %>% pull(value)
po <- filter(data_at_zero, name == "Posterior") %>% pull(value)
ggplot(pd, aes(x = x, y = value, colour = name)) +
geom_line() +
# geom_point(data = data_at_zero, aes(x = x, y = value),
# colour = "black", size = 2) +
# geom_label(data = data_at_zero, aes(x = x, y = value, colour = name, label = round(value, 3)),
# colour = "black", size = 3, position = position_dodge(.0001)) +
scale_color_colorblind("") +
geom_vline(xintercept = 0, linetype = "dotted") +
labs(y = "density", x = bquote(hat(beta))) #, subtitle = bquote("BF"==.(round(pr/po,2)))) +
dnorm(0, 0, 25)
[1] 0.016
dnorm(0, mean(beta), sd(beta))
[1] 0.00361
dnorm(0, 0, 25) / dnorm(0, mean(beta), sd(beta))
[1] 4.42
pd <- tibble(x = seq(-200, 200, by = 10),
Prior = dnorm(x, 0, 25),
Posterior = dnorm(x, mean(beta), sd(beta))) %>%
pivot_longer(Prior:Posterior)
data_at_zero <- filter(pd, x == 0)
pr <- filter(data_at_zero, name == "Prior") %>% pull(value)
po <- filter(data_at_zero, name == "Posterior") %>% pull(value)
ggplot(pd, aes(x = x, y = value, colour = name)) +
geom_line() +
# geom_point(data = data_at_zero, aes(x = x, y = value),
# colour = "black", size = 2) +
# geom_label(data = data_at_zero, aes(x = x, y = value, colour = name, label = round(value, 3)),
# colour = "black", size = 3, position = position_dodge(.0001)) +
scale_color_colorblind("") +
geom_vline(xintercept = 0, linetype = "dotted") +
labs(y = "density", x = bquote(hat(beta))) #, subtitle = bquote("BF"==.(round(pr/po,2)))) +
dnorm(0, 0, 2000)
[1] 0.000199
dnorm(0, mean(beta), sd(beta))
[1] 0.00361
dnorm(0, 0, 2000) / dnorm(0, mean(beta), sd(beta))
[1] 0.0553
pd <- tibble(x = seq(-200, 200, by = 10),
Prior = dnorm(x, 0, 2000),
Posterior = dnorm(x, mean(beta), sd(beta))) %>%
pivot_longer(Prior:Posterior)
data_at_zero <- filter(pd, x == 0)
pr <- filter(data_at_zero, name == "Prior") %>% pull(value)
po <- filter(data_at_zero, name == "Posterior") %>% pull(value)
ggplot(pd, aes(x = x, y = value, colour = name)) +
geom_line() +
# geom_point(data = data_at_zero, aes(x = x, y = value),
# colour = "black", size = 2) +
# geom_label(data = data_at_zero, aes(x = x, y = value, colour = name, label = round(value, 3)),
# colour = "black", size = 3, position = position_dodge(.0001)) +
scale_color_colorblind("") +
geom_vline(xintercept = 0, linetype = "dotted") +
labs(y = "density", x = bquote(hat(beta))) #, subtitle = bquote("BF"==.(round(pr/po,2)))) +
Use hypothesis() with a student_t(3, 0, 100) prior in the exercises:
- linear-models/exercises/hypothesis_testing_bf_1.R
- linear-models/exercises/hypothesis_testing_bf_2.R

hypothesis()

H: there is no difference between conjoined and simple NPs
hypothesis(fit_brm, "nptypesimple = 0")
Hypothesis Tests for class b:
          Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob Star
1 (nptypesimple) = 0    -33.2      17.2    -66.3     0.75         NA        NA     
---
'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
'*': For one-sided hypotheses, the posterior probability exceeds 95%;
for two-sided hypotheses, the value tested against lies outside the 95%-CI.
Posterior probabilities of point hypotheses assume equal prior probabilities.
H: simple NPs are faster than conjoined NPs
hypothesis(fit_brm, "nptypesimple < 0")
Hypothesis Tests for class b:
          Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob Star
1 (nptypesimple) < 0    -33.2      17.2    -61.4    -5.88         36      0.97    *
---
'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
'*': For one-sided hypotheses, the posterior probability exceeds 95%;
for two-sided hypotheses, the value tested against lies outside the 95%-CI.
Posterior probabilities of point hypotheses assume equal prior probabilities.
H: simple NPs are more than 10 msecs faster than conjoined NPs
hypothesis(fit_brm, "nptypesimple < -10")
Hypothesis Tests for class b:
                Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob Star
1 (nptypesimple)-(-10) < 0    -23.2      17.2    -51.4     4.12       11.1      0.92     
---
'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
'*': For one-sided hypotheses, the posterior probability exceeds 95%;
for two-sided hypotheses, the value tested against lies outside the 95%-CI.
Posterior probabilities of point hypotheses assume equal prior probabilities.
“Models are devices that connect theories to data. A model is an instantiation of a theory […]” (Rouder, Morey, and Wagenmakers 2016, 2)
fit <- brm(outcome ~ predictor + (1|participant), data = data, family = gaussian())
fit <- brm(outcome ~ predictor + (1|participant), data = data, family = bernoulli())
fit <- brm(outcome ~ predictor + (1|participant), data = data, family = poisson())
fit <- brm(outcome ~ predictor + (1|participant), data = data, family = zero_inflated_poisson())
fit <- brm(outcome ~ predictor + (1|participant), data = data, family = cumulative()) # or acat, sratio
fit <- brm(outcome ~ predictor + (1|participant), data = data, family = lognormal())
fit <- brm(outcome ~ predictor + (1|participant), data = data, family = exgaussian())
So far we have used a Gaussian model, which showed a poor fit to the data.
Alternative models: skewed normal, shifted (log)normal, Wiener Diffusion models, ex-Gaussian, mixture of lognormal etc. (see Matzke and Wagenmakers 2009; Roeser et al. 2024).
\[y \sim \mathcal{N}(\mu, \sigma^2)\]
\[ \log(y) \sim \mathcal{N}(\mu, \sigma^2) \]
\[ \log(y - \exp(\text{ndt})) \sim \mathcal{N}(\mu, \sigma^2) \]
Accounts for a minimum rt: the shift parameter represents the minimum possible rt, reflecting the fact that no response can be instantaneous.
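As a sketch (not from the original slides), such a shifted log-normal model could be fitted in brms with the same formula as the earlier Gaussian fit:

```r
library(brms)
# Shifted log-normal mixed model; ndt (the shift) is estimated from the data
fit_slog <- brm(rt ~ nptype + (nptype | ppt) + (nptype | item),
                data = data,
                family = shifted_lognormal())
# Compare model-implied and observed rt distributions
pp_check(fit_slog)
```

pp_check should show a better match to the right-skewed rt distribution than the Gaussian model did.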
Fit another probability model
file_dest <- "../plots/fit2data.pdf"
include_graphics(file_dest)
loo() uses Pareto smoothed importance sampling to approximate the LOO-IC (Vehtari, Gelman, and Gabry 2015, 2017).

loo() uses the log posterior predictive densities: how likely is each data point given the distribution parameter estimates?

log_lik(fit_brm) %>%
as_tibble() %>% mutate(sample = 1:n()) %>%
pivot_longer(-sample, names_to = "obs", values_to = "loglik") %>%
group_by(obs) %>%
summarise(M = mean(loglik),
var = var(loglik)) %>%
mutate(obs = as.numeric(gsub(pattern = "V", replacement = "", obs))) %>%
ggplot(aes(x = obs, y = M, ymin = M-var, ymax = M+var)) +
geom_pointrange(fatten = .5) +
labs(x = "Observations", y = "log-posterior predictive density (with variance)")
loo(fit_brm)
Computed from 4000 by 1036 log-likelihood matrix.
Estimate SE
elpd_loo -7141.4 40.8
p_loo 41.6 3.6
looic 14282.8 81.6
------
MCSE of elpd_loo is 0.1.
MCSE and ESS estimates assume MCMC draws (r_eff in [0.6, 1.5]).
All Pareto k estimates are good (k < 0.7).
See help('pareto-k-diagnostic') for details.

# The posterior is a product of n factors, one for each data point, with theta being all parameters:
# $$p(\theta \mid y) \propto p(\theta) \prod^n_{i=1} p(y_i \mid \theta)$$
# For LOO CV we exclude each data point one at a time, which is equivalent to multiplying the posterior by a factor of 1/p(y_i \mid \theta).
# The LOO posterior excluding i is p(\theta \mid y_{-i}) \propto p(\theta \mid y) / p(y_i \mid \theta).
# The LOO distribution uses the posterior simulations for \theta, giving each simulation a weight of 1 / p(y_i \mid \theta).
# The weighted simulations are used to approximate the predictive distribution of y_i (the held-out data point).
# Pointwise density: the average likelihood of every observation in the training sample, computed for each set of parameters sampled from the posterior distribution; average the likelihood for each observation, then sum over all observations, giving the log-pointwise-predictive-density (lppd).
# The effective number of parameters is the variance in log-likelihood for every observation: p_waic.
# WAIC: -2*(lppd - p_waic)
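These quantities are most useful for comparing models. A sketch (fit_slog is a hypothetical alternative model, e.g. one fitted with family = shifted_lognormal(); fit_brm is the Gaussian model from above):

```r
# Approximate leave-one-out cross-validation for each model
loo_gaussian <- loo(fit_brm)
loo_alternative <- loo(fit_slog)
# Models are ordered best first; elpd_diff is relative to the best model
loo_compare(loo_gaussian, loo_alternative)
```

A difference in elpd that is large relative to its standard error favours the better-predicting model.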
elpd_loo: the expected log predictive density, roughly the sum of the pointwise means penalised by p_loo

p_loo: the sum of the pointwise variances (the effective number of parameters)

looic: \(-2 \cdot\) elpd_loo (the deviance scale)
# The difference between two deviances has a chi-squared distribution; the factor
# of 2 scales it that way (also called the Bayesian deviance).
# N of parameters: the number of different conjectures for causes or explanations of the data.
# How much information do we want the model to provide? How flexible is the model
# in fitting the training sample?
# Penalty term: the expected distance between in-sample and out-of-sample deviance.
elpd_loo and looic rarely have a direct interpretation (as opposed to loo_R2()): what matters are differences between models.

p_loo is the effective number of parameters: how flexible the model fit is.
# loo_R2(mixturemodel)
# bayes_R2(mixturemodel)
linear-models/exercises/model_comparison.R

mc <- readRDS("../models/model_comparison.rda")
mc$diff %>%
as.data.frame() %>%
rownames_to_column("model") %>%
select(model:se_elpd_loo) %>%
mutate(across(where(is.numeric), ~ round(.x, 1))) %>%
kable() %>%
kable_styling("striped", full_width = F) %>%
column_spec(1:5, width = "10em") %>%
add_footnote("`elpd_loo` is reported as $\\\\widehat{elpd}$ and the difference as $\\\\Delta\\\\widehat{elpd}$.")
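The difference summary in the table above comes down to simple pointwise arithmetic. A base-R sketch with hypothetical stand-in values (these are not the workshop's fitted models) shows how `loo_compare()` computes the reported difference and its standard error from the per-observation elpd contributions:

```r
# Sketch of the computation behind loo_compare(): elpd_diff is the sum of the
# pointwise elpd differences, and se_diff uses their variance across observations.
# The pointwise elpd vectors here are simulated placeholders.
set.seed(1)
n <- 1036
elpd_m1 <- rnorm(n, mean = -6.9, sd = 1)                # pointwise elpd, model 1
elpd_m2 <- elpd_m1 + rnorm(n, mean = -0.05, sd = 0.2)   # model 2, slightly worse

diff_i <- elpd_m2 - elpd_m1
elpd_diff <- sum(diff_i)            # reported as elpd_diff
se_diff <- sqrt(n * var(diff_i))    # reported as se_diff
round(c(elpd_diff = elpd_diff, se_diff = se_diff), 1)
```

A rule of thumb: a difference within a couple of standard errors of zero gives little reason to prefer the more complex model.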
brms isn’t more complicated than lme4 and comes with more flexibility.

library(LaplacesDemon)
x <- seq(-10, 10, by = 0.1)
tibble("t0" = dnorm(x, mean=0, sd=1),
"t1" = dst(x, mu=0, sigma=1, nu = 1),
"t2" = dst(x, mu=0, sigma=1, nu = 5),
"t3" = dst(x, mu=0, sigma=2, nu = 5),
x = x) %>%
pivot_longer(-x) %>%
ggplot(aes(x = x, y = value, colour = name)) +
geom_line() +
scale_x_continuous(limits = c(-7, 7)) +
scale_color_colorblind("Parameter values",
breaks = paste0("t",0:3),
labels = c(bquote(mu==0*","~sigma==1),
bquote(mu==0*","~sigma==1*","~nu==1),
bquote(mu==0*","~sigma==1*","~nu==5),
bquote(mu==0*","~sigma==2*","~nu==5))) +
labs(x = "x", y = "density") +
theme(legend.justification = "top")
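The densities plotted above can also be simulated directly: a location-scale t-variate is just `mu + sigma * rt(n, df = nu)` in base R, matching the `dst()` parameterisation. A quick sketch (simulated values, not workshop data) shows why the Student-t family, with its heavier tails, is a popular robust alternative to the normal likelihood:

```r
# Draw from the location-scale t shown in the plot: mu + sigma * rt(n, df = nu).
set.seed(1)
n <- 1e5
x_t <- 0 + 1 * rt(n, df = 5)   # mu = 0, sigma = 1, nu = 5
x_norm <- rnorm(n)             # nu -> infinity recovers the normal

# Heavier tails: more mass beyond +/- 3 than under the normal
mean(abs(x_t) > 3)
mean(abs(x_norm) > 3)
```

In brms this only requires swapping the likelihood, e.g. `family = student()` in the `brm()` call, with `nu` estimated from the data.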